The Scalable Coherent Interface (SCI) - IEEE Communications Magazine

Author

  • David B.
Abstract

There is rapidly increasing demand for very-high-performance networked communication for workstation clusters, distributed databases, multiprocessors, industrial data acquisition and control systems, shared access to distributed data, and so on. Higher-bandwidth hardware using the traditional protocols is not sufficient. Even at 100 Mb/s, and certainly at 250 Mb/s, throughput for many applications is so limited by delays due to architecturally induced inefficiencies, such as software overheads (often hundreds of microseconds), that higher bandwidth generally raises cost without improving performance. A new approach to communication is required, one that can eliminate the delay due to software overheads, if we are to reap the full benefit of the far higher bandwidths that modern hardware can provide. SCI solves this problem by using the distributed-shared-memory paradigm, typically offering submicrosecond delays and bandwidths currently in the range of 1250 to 8000 Mb/s per network node.

This article first reviews the general properties that an appropriate system architecture should have, and introduces an architectural model, the Local Area MultiProcessor, distinguished by its shared-memory performance and its ability to handle LAN-style distances. These desired properties are then considered in more detail, and practical design decisions are made, illustrated by the evolution of the ISO/ANSI/IEEE standard Scalable Coherent Interface (SCI) as it addressed these issues. Finally, the current status of the various SCI follow-on and support projects is reported.

The Scalable Coherent Interface (SCI)
David B. Gustavson and Qiang Li, Santa Clara University

The technology described in this article arose from a project whose goal was a standard interconnect that would enable the use of many mass-produced microprocessors for supercomputer-class computation at low cost.
The usefulness of this technology for local area network (LAN) communication was an unexpected consequence of the scalability requirements of that project, so the technology is described in language appropriate for high-speed buses or memory systems as well as for LANs. This work is closely related to the IEEE Std 1394 Serial Bus, and to the IEEE P1394.2 Serial Express project, which has just started. This technology is contrasted with Serial Bus and Serial Express later in this article.

The supercomputing goal demands communication among multiple processors at the speed of main memory systems, with submicrosecond latency and multigigabyte-per-second bandwidths. There is no time for software-implemented protocols in such a system, so the communication model must be kept extremely simple: load from and store to memory that is in a single flat address space, but physically distributed among the processors. The physical-layer signaling mechanisms also must be kept very simple so that the interface circuits can run at very high speed without becoming expensive. However, this memory-like model is very general, so once the physical links proved capable of LAN distances the distributed-memory model became capable of LAN applications, backward-compatible with traditional protocols as needed during the transition to this new technology, yet fundamentally more efficient, able to utilize large numbers of gigabyte-per-second communication links effectively.

The use of the distributed memory model for networking is not new [1, 2], but these schemes had neither the very high bandwidths nor the support for caching provided by the Scalable Coherent Interface (SCI) technology. In the following, we return to the original motivation for developing the SCI (local area multiprocessor); its application to networking will be discussed in the fourth section.
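The claim that per-message software overhead, rather than link speed, limits throughput can be checked with simple arithmetic. The following is a minimal sketch, not taken from the article: the function name, the 1024-byte message size, and the specific overhead values (200 µs for "hundreds of microseconds", 1 µs for roughly "submicrosecond") are illustrative assumptions. It models each message as costing a fixed overhead plus its transmission time on the link.

```python
def effective_throughput_mbps(message_bytes, link_mbps, overhead_us):
    """Effective throughput in Mb/s when every message pays a fixed overhead.

    Hypothetical model for illustration: total time per message is
    overhead_us plus the time to clock the bits onto the link.
    """
    bits = message_bytes * 8
    # 1 Mb/s is exactly 1 bit per microsecond, so bits / Mb/s gives microseconds.
    transmit_us = bits / link_mbps
    # bits per microsecond is numerically the same as Mb/s.
    return bits / (overhead_us + transmit_us)


if __name__ == "__main__":
    message_bytes = 1024  # assumed message size
    for overhead_us in (200.0, 1.0):  # assumed software vs. hardware-level overhead
        for link_mbps in (100, 250, 1000):
            rate = effective_throughput_mbps(message_bytes, link_mbps, overhead_us)
            print(f"overhead {overhead_us:6.1f} us, link {link_mbps:5d} Mb/s "
                  f"-> effective {rate:7.1f} Mb/s")
```

Under these assumptions, a 200 µs overhead holds a 1024-byte message to roughly 29 Mb/s on a 100 Mb/s link and roughly 35 Mb/s on a 250 Mb/s link, so the faster link is mostly wasted. Cutting the overhead to about 1 µs is what lets a 1000 Mb/s link deliver most of its raw bandwidth.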
To meet the ever-increasing demand for computing power and the ever-decreasing expectation for its cost, the use of highly parallel multiprocessor systems is essential. Only a few of the traditional few-processor supercomputer companies have survived, and they are now adding multiprocessor-based product lines. It has been obvious for a long time that the cost effectiveness of the microprocessor is unexcelled, and microprocessor performance continues to grow exponentially. The difficulty has been learning how to solve hard problems by using many inexpensive processors instead of one extremely powerful (and expensive) processor.

The Scalable Coherent Interface is an interconnect technology designed to be scalable and cost-effective. The objectives of the interconnection can be summarized as follows:

  • We seek a scalable abstract specification of the interface to communicating components that are approximately at the level of microprocessors (intelligent I/O devices, memory, bridges to other systems, networks, etc.) so that these components can be assembled into more complex and capable multiprocessor systems.
  • We require support for efficient multiprocessing using both the message-passing and distributed-shared-memory models.
  • Because multiple cached copies of data will be present in most systems, we require this interface to include a cache coherence mechanism to keep these copies consistent.

HISTORY AND BACKGROUND

The Scalable Coherent Interface was developed by a number of high-performance bus designers and system architects who had come to understand the fundamental limits to bus technology during their work on Fastbus (IEEE 960) and Futurebus+ (IEEE 896.x). These contemporary buses pushed bus signaling technology to its limits, and provided various architectural features that support the use of multiple processors.
However, it was recognized that very soon microprocessor speeds would exceed the capability of any bus to support significant multiprocessing, and our efforts of the preceding decade were fated to a short life. A study group was organized in 1987 to look for some way out of this catastrophe. This Superbus Study Group met from November 1987 through July 1988, examining alternative approaches for providing buslike services while avoiding bus limitations. Gradually the form that the solution would require, if indeed a solution were possible, became clear. In July 1988 an official IEEE Working Group was chartered. The group was very fortunate to have several very talented and experienced core members who were available essentially full-time. Initially all the work

Similar resources

First Experience with the Scalable Coherent Interface

The research project RD24 [11] is studying applications of the Scalable Coherent Interface (IEEE-1596) standard for the large hadron collider (LHC). First SCI node chips from Dolphin were used to demonstrate the use and functioning of SCI’s packet protocols and to measure data rates. We present results from a first, two-node SCI ringlet at CERN, based on an R3000 RISC processor node and DMA node...

The Performance of SCI Multiprocessor Rings

The Scalable Coherent Interface (SCI) is an IEEE standard that defines a hardware platform for scalable shared-memory multiprocessors. This paper contains a quantitative performance evaluation of an SCI-connected multi-processor that assesses both the communication and cache coherence subsystems. For the architecture and workload simulated, it was found that the largest efficient ring size is eigh...

Hardware Support for Synchronization in the Scalable Coherent Interface (SCI)

The exploitation of the inherent parallelism in applications depends critically on the efficiency of the synchronization and data exchange primitives provided by the hardware. This paper discusses and analyses such primitives as they are implemented in a pending IEEE standard 1596 for communication in a shared memory multiprocessor, the Scalable Coherent Interface (SCI). The SCI synchronization p...

Non-Intrusive Deep Tracing of SCI Interconnect Traffic

The Scalable Coherent Interface (SCI) is one of the enabling interconnect technologies for high performance computing on PC Clusters. Trinity College Dublin has designed and is currently prototyping a trace instrument that allows deep traces of SCI interconnect traffic. Such an instrument is essential for a detailed spatial and temporal analysis of parallel executed algorithms on loosely couple...

Biologically inspired solutions to fundamental transportation problems

Traffic congestion in urban areas is an acute problem which is getting worse with the increased urbanization of the world population. The existing approaches to increasing traffic flow in urban areas have proven inefficient as they are expensive and, therefore, not scalable. It is shown in this paper that a biologically inspired new approach could solve some of the fundamental transportation pr...

Publication date: 2004